1.0 INTRODUCTION


Managers of large scale computer centers increasingly want to make objective,
verifiable statements about computer performance and capacity. This desire has
become more urgent as it has become more difficult to achieve. The complexity of
operation that has made the intuitive concepts of computer performance unreliable
has also made the previously part-time art of computer evaluation a specialized
discipline.

In previous generations of computers, prior to processing multiple runs simultane- 
ously and configuring central processing units and peripherals with plug-in 
flexibility, performance evaluation was a simple consideration of runs processed 
per unit time. Sophisticates of the art dealt with CPU time and some sources of 
delay. To configure a system to a workload one considered average instructions, 
amounts of data, processor cycle times and output speeds. All tasks were pro- 
cessed serially, one after the other, and system impact was calculated by summing 
up the component times of a few prototype jobs. Systems were tuned by watching 
them run. 

Performance evaluation in the multiprogramming/multiprocessing generation is ut- 
terly transformed. At any moment, numerous runs are active within the computer, 
competing for services from all system components. The same run may compete 
simultaneously for different computer services. The impact of a run on system 
performance is a function of the total workload during the life of the run. The 
history of a program’s activity in the computer system is never exactly the same 
for any two executions. 

The Slidell Computer Complex (SCC) operates Univac 1108 computer systems in sup- 
port of batch and terminal applications. User requirements vary widely in terms 
of program size, processor requirements and mass storage usage. The environment 
is in every way typical of a large scale, open shop computer facility. 

The SCC conducts an ongoing analysis of U1108 work flow to establish capacity
estimates and to measure performance. A major goal has been to define the
capacity function in terms of two independent classes of variables: computer
configuration and workload profile. It is recognized that variations in system
performance result from changes in both the physical structure of the machine
and the requirements structure of the workload.

A number of approaches to performance evaluation have been considered at the SCC.
Attaching electronic probe monitors to various critical system components is being
considered. System performance has been monitored by a special software
implementation (Software Instrumentation Program - SIP). Regression analysis has
been used to find linear relationships between CPU accumulations and selected
measurable parameters. Reasonable capacity estimates have been obtained from
regression analysis, but the equations are difficult to adjust for changing
environments. It is not always apparent how the so-called independent variables
respond to drastic shifts in workload and configuration. This shortcoming is
fundamental. The relationship between meaningful independent variables and system
performance is not expressible as a regression curve. Trend analysis fails when
the trend changes.

The SCC's most recent performance evaluation tool, a U1108 performance model,
considers the computer to be a network of service centers. The workload is
conceived as a set of service requests. Each request is queued and processed under
control of user programs and system software. Capacity is defined as the work
level at which the network saturates. The configuration and workload are defined
in terms of independent, predictable parameters. Queueing theory is used to
calculate the work flow dynamics. Section 2.0 gives a brief, intuitive
development of the theory. Section 3.0 describes the model. Section 4.0 is a
detailed development of the numeric techniques used in the model. An example
of model application is presented in Section 5.0. Section 6.0 is a user's
guide to the computer program implementing the model and Section 7.0 presents
the program listing.


2.0 SERVICE QUEUES 

If a service center is busy at the time a request for service arrives, a wait
period (or queue time) accrues. The average queue time for a series of requests
can be estimated by queueing theory.

Consider a service center as depicted below: 


    Arriving Requests  -->  [ SERVICE CENTER ]  -->  Completed Requests


Each service request has two attributes that determine its interaction with the
service center: its arrival time and the amount of service requested. The
service center's performance is determined by the number of servers (the number
of simultaneous requests that it can serve) and the processing rate of each
server. Estimation of these parameters allows calculation of the probability of an
arrival in an arbitrary time period and the probability of all servers being
busy at the time of an arrival. The probability of an arbitrary wait period may
then be expressed and integrated with respect to time to yield the average wait
time.

To estimate the probability of an arrival in an arbitrary interval of time, two
assumptions are made to simplify the calculations:

i. The probability of an arrival in t seconds is proportional to t
(i.e., the longer the wait for a service request, the greater the
chances of receiving one).

ii. The probability of more than one arrival in t seconds shrinks
faster than t (i.e., arrivals are sequential and not clustered).

These assumptions allow the probability of arrival to be expressed by the Poisson
distribution:

$$ P(n \text{ arrivals in time } t) = \frac{(at)^{n} e^{-at}}{n!} $$

where a is the average arrival rate.

NOTE: The notation P(X) will be used to denote "the probability of event X". 
Similar considerations lead to an exponential representation of the service rate:

$$ P(n \text{ requests serviced in time } t) = \frac{(bt)^{n} e^{-bt}}{n!} $$

where b is the average service rate.
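As a rough illustration of the arrival distribution, the sketch below evaluates the Poisson probability of n events in an interval t. The function name and the sample rate and interval are hypothetical and do not come from the report.

```python
import math


def poisson_probability(n: int, rate: float, t: float) -> float:
    """P(n events in time t) for a Poisson process with the given average rate."""
    mean = rate * t
    return mean ** n * math.exp(-mean) / math.factorial(n)


# Hypothetical example: an average of 2 arrivals per second, observed for 1.5 seconds.
a, t = 2.0, 1.5
probs = [poisson_probability(n, a, t) for n in range(50)]

print(f"P(0 arrivals) = {probs[0]:.4f}")
print(f"P(1 arrival)  = {probs[1]:.4f}")
print(f"sum over n    = {sum(probs):.6f}")  # should be very close to 1
```

The same function applies to the service distribution if the average service rate is passed in place of the arrival rate.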

Using these probability distributions, we can express the average queue time in
terms of

i. the average arrival rate,
ii. the average service rate, and
iii. the number of servers.

For the U1108 performance model, the number of servers is a computer configura- 
tion parameter. The average service rate is a function of workload and config- 
uration. The average arrival rate may be considered an independent variable in 
the queue calculation; for a given arrival rate, a determinable queue time results. 

If we assume that queued requests are processed on a first-come first-served basis
and that requests do not defect from the queue before being served, then a simple
queue time calculation can be formulated. The derivation involves development of
differential equations for two cases.

case 1. There is no arrival in an arbitrarily small period of time.
case 2. There is exactly one arrival in an arbitrarily small period.

With the assumption of Poisson arrivals, these are the only two possible cases
since the arrivals do not cluster. Average queue time can be expressed as:

$$
\mathrm{QUEUE}(A,B,C) = \frac{1}{BC-A}
\left(\frac{C\,P^{C}}{C!\,(C-P)}\right)
\left(\frac{C\,P^{C}}{C!\,(C-P)} + \sum_{i=1}^{C}\frac{P^{C-i}}{(C-i)!}\right)^{-1}
$$

if, and only if, BC > A,

where A = average arrival rate
      B = average service rate
      C = number of servers
      P = A/B

It should be noted that if A is greater than or equal to BC, the average queue
time is infinite and the service center is saturated. That is, if the arrival
rate reaches or exceeds the product of the service rate and the number of servers,
the service center is overloaded. Capacity is conceived as the upper limit of
arrival rates that do not exceed the service rate times the number of servers.
Within a network of service centers, the capacity for the network is the lowest
input rate which saturates one of the centers.
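The queue calculation can be sketched in a few lines. The version below assumes the expression is the standard first-come first-served multi-server result with Poisson arrivals, exponential service and an integer number of servers; the function name `queue_time` is hypothetical and this is not the report's own program.

```python
import math


def queue_time(a: float, b: float, c: int) -> float:
    """Average queue time for arrival rate a, service rate b and c servers.

    Returns math.inf when a >= b * c, the saturated case described in the text.
    Assumes c is a positive integer.
    """
    if a >= b * c:
        return math.inf
    p = a / b
    # Weight of the "all servers busy" states.
    busy = c * p ** c / (math.factorial(c) * (c - p))
    # Weight of the states with fewer than c requests in service.
    not_busy = sum(p ** i / math.factorial(i) for i in range(c))
    return busy / ((b * c - a) * (busy + not_busy))


print(queue_time(0.5, 1.0, 1))  # single server at half load
print(queue_time(1.0, 1.0, 2))  # two servers, each half loaded
print(queue_time(2.0, 1.0, 2))  # arrival rate equals b * c: saturated
```

For a single server the expression reduces to P / (B - A), which is a convenient hand check on any implementation.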


3.0 WORKFLOW MODEL

To model the U1108 workflow, we wish to know what happens to a computer task (run) 
during its active life in the computer. We know that part of this time is spent 
in the service queues. Other delays occur that are related to the structure of 
the run and the state of the computer system. 
We may categorize this elapsed time as:

i. service time,
ii. service queue time,
iii. memory queue time,
iv. voluntary delay time, and
v. involuntary delay time.

Service time includes the CPU time and the I/O traffic time. CPU time is a
function of the instruction sequence of the run and the CPU/main memory cycle
speed.

I/O traffic time is a function of data words transferred, record size, and the
speed of the I/O device. Since a given run may have its I/O requirements serviced
by a variety of devices, each with its own speed, the service time is dependent on
the probability of using a specific I/O device. These probabilities will be called
the I/O traffic patterns.

Service queue time is the wait period for CPU and I/O traffic services.

Memory queue time is the wait period prior to receiving an allocation of main
memory. This allocation must be long enough to encompass both the service and
service queue times.

Voluntary delay time includes periods when the run is temporarily requesting no
services. Such delays typically occur on interactive runs input from demand
terminals when the user is not transmitting requests.

Involuntary delay time consists of periods when the run is prevented from making 
service requests. The usual cause is a request for I/O from a magnetic tape 
servo before a tape has been physically mounted. 

Runs, of course, do not accumulate elapsed time as might be implied by this
categorization, getting all the service queue time, then all the service time, then
all voluntary delay and so forth. The actual history of a run may involve many
small increments of time in all of these categories. This organization of the
elapsed time is important because it suggests a way to estimate it, not because
it depicts a micro view of the life of a run.

To calculate queue times we consider the U1108 computer to be a network of service
centers. The network contemplates three major computer services, viz. central
processor (CPU) service, I/O traffic service and main memory service. It assumes
that a task is main memory resident during the time it is queued for and
receiving CPU and I/O services. The I/O traffic services are categorized by
specific I/O device.

Figure 1 is a general schematic of the first part of the queueing network. As
depicted, each I/O device (excluding unit record devices) is contemplated
separately.

CPU and I/O requests flow to their respective service centers. The rate at which
these services are requested, together with the rate at which CPU and I/O queue
time are accumulated, make up the memory service input rate. The schematic seems
to turn the actual operation of the computer inside out. Runs actually receive
main memory allocation before CPU and I/O services. However, to calculate the
main memory queue time, it is necessary first to calculate the CPU and I/O queues
since this wait time is part of the main memory service request rate.

The model also includes estimates of voluntary and involuntary delay time. These
estimates plus service requests and queue times provide an average elapsed time
estimate for a given work input rate.

As depicted in Figure 2, this estimate of the elapsed time rate is used as input
to the batch delay service center. This center simulates the operator's control
over batch runs. A software valve controlled by a console key-in prevents more
than a specified number of batch runs from becoming active at the same time. The
batch delay queue estimates this unrecorded elapsed time and adjustments are
made to the elapsed time estimate.


4.0 MODEL MATHEMATICS 

The mathematics used in the model assume that the work input rate, the computer 
configuration and the workload profile are given. Performance parameters are 
computed from these three major variables. 

4.1 WORKLOAD INPUT RATE 

The operating system of the U1108 computer calculates an estimate of service 
requirements called the Standard Unit of Processing (SUP). The SUP accumulates 
the CPU time used by a run and estimates the I/O time. Taken collectively for 
all runs processed in a unit period of time, the SUP provides an estimate of the 
total service requirements. 

The accuracy of the SUP estimate is variable. CPU time is taken from the internal
clock and is an accurate measure of the requirements of a run, except that not all
functions of the operating system are included. The I/O time is estimated,
based on words transferred, average access time and transfer times. The estimate
assumes that I/O occurs on the mass storage device requested by the run even
though another physical device may have been substituted by the operating system.
The CPU and I/O time used to perform executive requests and execute control card
functions are estimated from a table of fixed charges. The accuracy of these
fixed charges may vary from run to run, and it is also not apparent how much of
the charge represents CPU time and how much I/O time.

These accuracy problems notwithstanding, the SUP is the best available estimate
of collective service requirements. Benchmark runs indicate that it is accurate
enough.

It is used by the model as the basic measure of performance. The computer input 
rate is expressed in terms of SUP hours per hour of effective computer time. 

Effective computer time is defined as the time the computer produces output. It
excludes downtime, idle time and the apparently productive time spent on runs
which are active, and therefore lost, when a system failure occurs.
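The input rate defined above is a simple ratio. The sketch below computes it from the U1108-01 totals reported in Table A (SUP Accumulation and Effective Productive Time); the function name is hypothetical.

```python
def sup_input_rate(sup_hours: float, effective_hours: float) -> float:
    """SUP hours accumulated per hour of effective computer time."""
    if effective_hours <= 0:
        raise ValueError("effective time must be positive")
    return sup_hours / effective_hours


# U1108-01 totals from Table A, week ending 2 May 1976.
day = sup_input_rate(91.9, 37.5)     # 0800-1600, Mon-Fri
other = sup_input_rate(126.8, 61.8)  # other periods

print(f"0800-1600 Mon-Fri: {day:.2f} SUP hours per hour")   # 2.45
print(f"other periods:     {other:.2f} SUP hours per hour")  # 2.05
```

These two rates agree with the lowest SUPS Per Hour entries in the U1108-01 day and night benchmark tables, which suggests that each benchmark starts from the observed input rate.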

4.2 WORKLOAD PROFILE 

The workload is profiled in terms of its impact on each element of the model. 
Specifically, the workload profile includes the following: 

1. Rc = the rate of CPU requirements expressed as CPU time per SUP.

2. Ri = the rate of I/O requirements expressed as words transferred per SUP.

3. P(n) = the probability a given I/O request occurs on device n.

4. W(n) = the average words per I/O request for device n.

5. D = the ratio of demand to batch runs.

6. [symbol illegible] = magnetic tapes requested per unit of effective time.

7. Rp = the rate at which runs are initiated expressed as runs per SUP.
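The per-SUP rates in this profile (CPU time per SUP, words transferred per SUP and runs per SUP) can be illustrated with the accounting totals of Table A. The sketch assumes each rate is obtained by dividing a weekly total by the SUP accumulation; the report does not spell out this step, and the class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class WorkloadTotals:
    """Accounting totals for one period (field names are illustrative)."""
    cpu_hours: float
    sup_hours: float
    words_transferred: float
    runs_processed: float


def profile_rates(t: WorkloadTotals) -> dict:
    """Per-SUP rates, assuming each is a period total divided by SUP hours."""
    return {
        "cpu_time_per_sup": t.cpu_hours / t.sup_hours,
        "words_per_sup": t.words_transferred / t.sup_hours,
        "runs_per_sup": t.runs_processed / t.sup_hours,
    }


# U1108-01, 0800-1600 Mon-Fri, from Table A.
day = WorkloadTotals(22.2, 91.9, 3_683_716_352.0, 1120.0)
rates = profile_rates(day)
for name, value in rates.items():
    print(f"{name}: {value:.4g}")
```

Multiplying the runs-per-SUP rate (about 12.2) by an input rate of 2.45 SUP hours per hour gives roughly 29.9 runs per hour, close to the first Runs Per Hour entry of the day shift benchmark table.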

4.3 COMPUTER CONFIGURATION 

The model definition of the configuration consists of the following: 

1. M = amount of main memory available to the user.

2. Nc = the number of CPU's.

3. Ni(n) = the number of I/O requests for device type n that may be
processed simultaneously.

4. Ra(n) = the average access time for device n.

5. Rt(n) = the transfer rate for device n.

6. Lb = the maximum batch runs allowed active simultaneously.

4.4 CPU SERVICE 

For a given SUP rate Rs, the rate at which CPU service is requested is Rs·Rc.

The rate at which the CPU can theoretically provide service is one hour of CPU
time per hour of effective time. We may use the mathematics of Section 2.0 to
calculate the CPU queue time per unit of effective time as:

Qc = CPU QUEUE RATE = QUEUE (A,B,C)

where: A = Rs·Rc
       B = 1
       C = Nc

4.5 I/O SERVICE 

For SUP rate Rs and device n, the rate at which service time is requested is:

A = Rs·Ri·P(n)·(Rt(n) + Ra(n)/W(n))

As above, with B = 1 and C = Ni(n),

Qi(n) = QUEUE RATE FOR DEVICE n = QUEUE (A,B,C)


4.6 MAIN MEMORY SERVICE


Before programs can be considered for CPU and I/O services, they must be resident
in the main memory of the computer. The amount of memory required is equal to the
program size and varies greatly from one task to the next. The time during which
the memory allocation is required is estimated by the SUP total plus the CPU and
I/O queue times.

Tasks do not normally receive a single block of memory residence time. Runs are
removed from main memory and swapped for others based on a complicated priority
scheme. A single task may be swapped several times before it completes.

We wish to estimate the amount of time that a task seeks but is unable to receive
main memory. This is done by defining the main memory as a service center and
calculating the queue time from the techniques in Section 2.0. The queue time
so calculated is the total wait time for memory, including the hiatus prior to
initial load and the portion of the swap-out periods that are due to memory
competition.


To calculate the memory queue, we must define the parameters A, B, and C from
Section 2.0. Recall that A is the service center input rate, B is the service
rate, and C is the number of requests that can be serviced simultaneously. We have
already mentioned that runs require main memory for the full SUP duration plus
the CPU and I/O queue times.

A = Rs + Qc + Σ Qi(n)   (summed over all devices n)

B = 1


C, the number of servers, may be translated as the number of programs that can be
fit simultaneously into the user's portion of main memory. This is clearly a
function of the probability H(m) that a program of given size m will need main
memory.

This main memory run level parameter is estimated as:

$$ C = \frac{MAX}{\sum_{m=1}^{MAX} m\,H(m)} $$

where MAX is the maximum user memory available.


In practice H(m) is estimated by:

$$ H(m) = \frac{SUP(m)}{\sum SUP} $$

where SUP(m) is the SUP accumulation for programs of size m and ΣSUP is the
total SUP accumulation for all runs.
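The run level estimate amounts to dividing the available memory by the SUP-weighted average program size. A minimal sketch follows, assuming the size profile is supplied as a mapping from program size to SUP accumulation; the function name and the sample profile are hypothetical.

```python
def memory_run_level(max_memory: float, sup_by_size: dict) -> float:
    """C = MAX / sum(m * H(m)), with H(m) = SUP(m) / total SUP."""
    total_sup = sum(sup_by_size.values())
    if total_sup <= 0:
        raise ValueError("total SUP accumulation must be positive")
    # SUP-weighted average program size, i.e. the sum of m * H(m).
    mean_size = sum(m * sup / total_sup for m, sup in sup_by_size.items())
    return max_memory / mean_size


# Hypothetical profile: program size in core blocks -> SUP accumulation.
profile = {25: 40.0, 55: 30.0, 65: 30.0}
c = memory_run_level(300.0, profile)
print(f"average size 46 core blocks, run level C = {c:.2f}")
```

The result is generally not a whole number. The text does not say how a fractional run level is handled in the QUEUE calculation, so any rounding or interpolation would be a further assumption.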

4.7 VOLUNTARY DELAY

Regression analysis has shown that voluntary delay time is almost exclusively
due to user delays on demand runs. Regression curves have been developed to
estimate the delay based on two variables, the number of batch and demand runs
processed. These curves must be updated periodically.

4.8 INVOLUNTARY DELAY

Regression analysis has shown that involuntary delay time is primarily incurred
while magnetic tapes are mounted. Estimates are based on the number of tape
mounts requested. Estimation coefficients must be updated periodically.

4.9 BATCH DELAY TIME 

The batch delay valve may be considered a service center with an input rate
equal to the rate at which elapsed time accumulates for batch runs, less the
batch delay rate itself. The service rate is unity and the number of servers
is the number of batch runs allowed to be active simultaneously (variable Lb
in Section 4.3). That is:

Bq = BATCH DELAY TIME = QUEUE (A, 1, Lb)

where, if D  = ratio of demand to batch runs
          Di = involuntary delay
          Dv = voluntary delay
          Qm = memory queue

     ELAPSE = Rs + Qc + Σ Qi(n) + Qm + Di + Dv

then

     A = (ELAPSE - Bq) (1 - D)

thus

     Bq = QUEUE ((ELAPSE - Bq) (1 - D), 1, Lb)

is an implicit function of the form

     f(X) = X

and may be solved by an iterative technique. The program implementing this model
uses a Wegstein approximation to evaluate Bq.
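The implicit batch delay equation can be solved with a Wegstein iteration along the following lines. This is a sketch under stated assumptions, not the report's program: `queue_time` is a stand-in for QUEUE using the standard multi-server result, and the values of ELAPSE, D and Lb are hypothetical.

```python
import math


def queue_time(a: float, b: float, c: int) -> float:
    """Stand-in for QUEUE(A, B, C): standard multi-server average queue time."""
    if a >= b * c:
        return math.inf
    p = a / b
    busy = c * p ** c / (math.factorial(c) * (c - p))
    not_busy = sum(p ** i / math.factorial(i) for i in range(c))
    return busy / ((b * c - a) * (busy + not_busy))


def wegstein(g, x0: float, tol: float = 1e-10, max_iter: int = 100) -> float:
    """Solve x = g(x) by Wegstein's secant-accelerated substitution."""
    x_prev, g_prev = x0, g(x0)
    x = g_prev  # the first step is plain substitution
    for _ in range(max_iter):
        gx = g(x)
        if abs(gx - x) <= tol:
            return gx
        dx = x - x_prev
        slope = (gx - g_prev) / dx if dx != 0.0 else 0.0
        q = slope / (slope - 1.0) if slope != 1.0 else 0.0
        x_prev, g_prev = x, gx
        x = q * x + (1.0 - q) * gx
    raise RuntimeError("Wegstein iteration did not converge")


# Hypothetical inputs: elapsed time rate, demand fraction, batch run limit.
ELAPSE, D, LB = 3.0, 0.2, 4


def g(bq: float) -> float:
    """Right-hand side of Bq = QUEUE((ELAPSE - Bq)(1 - D), 1, Lb)."""
    return queue_time((ELAPSE - bq) * (1.0 - D), 1.0, LB)


bq = wegstein(g, 0.0)
print(f"Bq = {bq:.4f}, residual = {abs(g(bq) - bq):.1e}")
```

Because a larger Bq lowers the input rate and therefore the queue, the right-hand side decreases as Bq grows, which is the situation where plain substitution tends to oscillate and the Wegstein correction helps.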

The memory queue for batch runs is reduced by the batch delay queue since batch 
runs accumulate time behind the batch delay valve instead of in the memory queue. 


5.0 AN EXAMPLE 

Discussing the theoretical basis for the model does not suggest the way it is 
used in analyzing computer performance. An example will accomplish this better 
than abstract arguments. 

The SCC has at this time, May 1976, three U1108 configurations. U1108-01 is a
multiprocessing system having two central processors and 262K words of main
memory. Direct access mass storage is provided by three types of device. There
are 787K words available on a high speed drum system designated as an FH432. A
lower speed drum device, FH1782, provides 8.4M words. A disc device, F8440,
provides 240.8M words. There are 24 tape drives available to the system. U1108-01
supports interactive demand terminals, batch terminals, and batch processing
submitted from the machine room floor.

System 1108-02 has only one processor and only 131K words of main memory. Mass
storage is provided by 2.4M words of FH432 drum space and 88.1M words of a very
low speed drum device called Fastrand. Twelve tape drives are available. The
system is used to process batch runs submitted from the floor.

The 1108-03 configuration includes a single processor and 262K words of main
memory. There are 525K words of FH432, 4.2M words of FH1782, 137.6M words of
F8440, and 24 tape drives available. The 03 system processes batch runs
submitted both from the floor and from remote batch terminals. There are no demand
(interactive) terminals connected to this system.

For this example, we will investigate the effect of discontinuing the 02 con- 
figuration. How could the remaining equipment be best utilized? 

Conceptually, the analysis must define the workload and test alternative methods 
of processing it. Part of the workload definition should be to assess performance 
of the current configurations. Thus we have a benchmarking task to determine 
where we are, and an experimental task to assess alternatives. 

The operating system of the U1108 produces data intended for use in billing com- 
puter users. These accounting data provide an excellent workload profile. 

Tables A, B, and C present data for the three SCC U1108 configurations depicting
a week's actual work. While these profiles are not necessarily typical of future
work, they will be so construed for this illustration. The workload for U1108-01
is considered in two parts since most demand terminal work is processed between
0800 and 1600 hours, Monday through Friday. The profile of demand work is
distinctly different from the batch work.

A few observations can be made from an inspection of the performance data. For
example, the mass storage demands on the 02 system can be absorbed by the other
two systems with a net increase of less than 5% each. The profiles of mass
storage usage on the 01 and 03 systems indicate that this demand can be met
without impairing operations.

The main memory profiles show that the 02 system typically has greater memory
demands than the other two: the average resident program is bigger. We also
note that the heavy demand terminal support during the 0800-1600 period involves
small programs. We probably won't want to mix the large batch programs from the
02 system with the small demand runs on the 01.

The service requirements for all three systems can be seen in Figures 3, 4, and 5
which depict the SUP rate as a function of time. It is apparent that service
requirements build during the 0800-1600 hour time period for the 01 and 03
systems. We will want to provide this same response even after the work from the
02 is absorbed.

To benchmark the current configuration, the model was run using the actual work- 
loads depicted in tables A, B, and C and the actual system configuration. The 
results are tabulated in tables D, E, F, and G. 
U1108-01 WORKLOAD
WEEK ENDING 2 MAY 1976

                                          0800-1600            Other
                                            Mon-Fri          Periods
THROUGHPUT
  CPU Hours                                    22.2             46.2
  Executive Request Charge                     21.2             17.2
  SUP Accumulation                             91.9            126.8
  Voluntary Delay                             282.2             65.4
  Elapsed Time Accumulation                   554.4            342.9

ACTIVITY
  Number of Runs Processed                   1120.0            717.0
  Average Batch Runs Active                     2.2              4.3
  Average Demand Runs Active                   12.5              1.3
  Average Total Runs Active                    14.8              5.5
  Average Runs Not in Main Memory               8.6              1.8

PROCESSING TIME
  Total Time Not Idle                          40.0             87.4
  Actual Productive Time                       39.2             61.8
  Effective Productive Time                    37.5             61.8
  System Failures                                 0              2.0

I/O TRAFFIC PATTERNS
  Total Words Transferred           3,683,716,352.0  4,222,021,056.0
  Percent on FH432                             28.2               13
  Percent on FH1782                             4.5                4
  Percent on F8440                             48.6               57
  Percent on Mag Tape                          18.7               23

FACILITIES USAGE
  Main Memory (Core Blocks)
    Average Available                           298              314
    Average Used                                253              223
    Percent of Time 50% Full                     96               82
    Percent of Time 75% Full                     84               55
    Percent of Time 90% Full                     51               23
    Percent of Time 99% Full                      3                2
  FH432 (Tracks)
    Average Available                             0                0
    Average Used                                439              439
    Percent of Time 50% Full                    100              100
    Percent of Time 75% Full                    100              100
    Percent of Time 90% Full                    100              100
    Percent of Time 99% Full                    100              100
  FH1782 (Tracks)
    Average Available                           397              664
    Average Used                               4284             4017
    Percent of Time 50% Full                    100              100
    Percent of Time 75% Full                     98               93
    Percent of Time 90% Full                     63               20
    Percent of Time 99% Full                     14                1
  F8440 (Tracks)
    Average Available                         44601            34763
    Average Used                              89799            99637
    Percent of Time 50% Full                     89               75
    Percent of Time 75% Full                     34               12
    Percent of Time 90% Full                      2                2
    Percent of Time 99% Full                      0                0
  Tape Units
    Average Available                           7.9             10.2
    Average Used                               16.1             13.8
    Percent of Time 50% Full                   91.0             82.0
    Percent of Time 75% Full                   31.0             55.0
    Percent of Time 90% Full                   11.0             23.0
    Percent of Time 99% Full                    5.0              1.0
    Tapes Mounted                              1485             1976

MAIN MEMORY PROFILE
Percent of SUP Total Used by Programs Occupying:
  Core Blocks
    0-10                                         .5               .2
    10-20                                       3.4              1.6
    20-30                                      38.7             13.8
    30-40                                      10.3              6.7
    40-50                                       9.9              5.6
    50-60                                      15.5             10.0
    60-70                                      10.2             33.7
    70-80                                       7.3             13.8
    80-90                                       1.3              2.2
    90-100                                       .7               .1
    100-110                                      .1              1.1
    110-120                                      .2              1.1
    120-130                                      .8              1.1
    130-140                                     1.1               .6
    140-150                                                       .3
    150-160                                                      6.4
    160-170
    170-180
    180-190
    190-200
    200-210
    210-220
    220-230

Note: the Other Periods entries for 160-170 through 220-230 core blocks are not
legible in the source.

TABLE A


U1108-02 WORKLOAD
WEEK ENDING 2 MAY 1976

THROUGHPUT
  CPU Hours                                    15.5
  Executive Request Charge                      3.9
  SUP Accumulation                             82.0
  Voluntary Delay                               1.5
  Elapsed Time Accumulation                    97.2

ACTIVITY
  Number of Runs Processed                     77.0
  Average Batch Runs Active                     1.2
  Average Demand Runs Active                    0.0
  Average Total Runs Active                     1.2
  Average Runs Not in Main Memory                .0

PROCESSING TIME
  Total Time Not Idle                         111.8
  Actual Productive Time                       89.9
  Effective Productive Time                    82.7
  System Failures                               1.0

I/O TRAFFIC PATTERNS
  Total Words Transferred           1,752,952,368.0
  Percent on FH432                             10.0
  Percent on Fastrand                          79.4
  Percent on Mag Tape                          10.6

FACILITIES USAGE
  Main Memory (Core Blocks)
    Average Available                           162
    Average Used                                134
    Percent of Time 50% Full                     85
    Percent of Time 75% Full                     84
    Percent of Time 90% Full                     72
    Percent of Time 99% Full                      0
  FH432 (Tracks)
    Average Available                           406
    Average Used                                910
    Percent of Time 50% Full                    100
    Percent of Time 75% Full                     13
    Percent of Time 90% Full                      0
    Percent of Time 99% Full                      0
  Fastrand (Tracks)
    Average Available                         37506
    Average Used                              11646
    Percent of Time 50% Full                      0
    Percent of Time 75% Full                      0
    Percent of Time 90% Full                      0
    Percent of Time 99% Full                      0
  Tape Units
    Average Available                           9.7
    Average Used                                2.3
    Percent of Time 50% Full                    0.0
    Percent of Time 75% Full                    0.0
    Percent of Time 90% Full                    0.0
    Percent of Time 99% Full                    0.0
    Tapes Mounted                             192.0

MAIN MEMORY PROFILE
Percent of SUP Total Used by Programs Occupying:
  Core Blocks
    0-10                                       34.1
    10-20                                        .2
    20-30                                       2.1
    30-40                                        .1
    40-50                                        .0
    50-60                                       7.6
    60-70                                      14.0
    70-80                                        .0
    80-90                                        .0
    90-100                                       .0
    100-110                                      .0
    110-120                                      .0
    120-130                                     2.0
    130-140                                      .0
    140-150                                    21.2
    150-160                                    18.8

TABLE B
U1108-03 WORKLOAD
WEEK ENDING 2 MAY 1976

THROUGHPUT
  CPU Hours                                    51.7
  Executive Request Charge                     25.6
  SUP Accumulation                            176.7
  Voluntary Delay                              19.2
  Elapsed Time Accumulation                   569.3

ACTIVITY
  Number of Runs Processed                   1301.0
  Average Batch Runs Active                     5.7
  Average Demand Runs Active                    0.0
  Average Total Runs Active                     5.7
  Average Runs Not in Main Memory               1.9

PROCESSING TIME
  Total Time Not Idle                         123.0
  Actual Productive Time                      115.3
  Effective Productive Time                    99.4
  System Failures                               1.0

I/O TRAFFIC PATTERNS
  Total Words Transferred           6,723,282,496.0
  Percent on FH432                             15.6
  Percent on FH1782                             1.7
  Percent on F8440                             59.8
  Percent on Mag Tape                          22.9

FACILITIES USAGE
  Main Memory (Core Blocks)
    Average Available                         318.0
    Average Used                              260.0
    Percent of Time 50% Full                   94.0
    Percent of Time 75% Full                   76.0
    Percent of Time 90% Full                   38.0
    Percent of Time 99% Full                    2.9
  FH432 (Tracks)
    Average Available                             0
    Average Used                                293
    Percent of Time 50% Full                    100
    Percent of Time 75% Full                    100
    Percent of Time 90% Full                    100
    Percent of Time 99% Full                    100
  FH1782 (Tracks)
    Average Available                           177
    Average Used                               2164
    Percent of Time 50% Full                    100
    Percent of Time 75% Full                     98
    Percent of Time 90% Full                     77
    Percent of Time 99% Full                     10
  F8440 (Tracks)
    Average Available                         25936
    Average Used                              50864
    Percent of Time 50% Full                     79
    Percent of Time 75% Full                     32
    Percent of Time 90% Full                      3
    Percent of Time 99% Full                      0
  Tape Units
    Average Available                           9.3
    Average Used                               14.7
    Percent of Time 50% Full                     75
    Percent of Time 75% Full                     24
    Percent of Time 90% Full                      7
    Percent of Time 99% Full                      2
    Tapes Mounted                              3547

MAIN MEMORY PROFILE
Percent of SUP Total Used by Programs Occupying:
  Core Blocks
    0-10                                         .2
    10-20                                        .6
    20-30                                      11.9
    30-40                                       7.9
    40-50                                       7.6
    50-60                                      23.5
    60-70                                      24.2
    70-80                                      13.4
    80-90                                       1.1
    90-100                                      1.1
    100-110                                     1.8
    110-120                                     1.3
    120-130                                      .5
    130-140                                      .8
    140-150                                     1.2
    150-160                                     1.9
    160-170                                     0.0
    170-180                                     0.0
    180-190                                      .3
    190-200                                      .5
    200-210                                     0.0
    210-220                                      .4

TABLE C
U1108-01 MODEL BENCHMARK
DAY SHIFT
WORKLOAD FROM W/E 2 MAY 1976

  Runs   SUPS      I/O      CPU    Memory    Batch  Voluntary  Involuntary     Percent
   Per    Per    Queue    Queue     Queue    Queue      Delay        Delay  Saturation
  Hour   Hour  Per Hr.  Per Hr.   Per Hr.  Per Hr.    Per Hr.      Per Hr.

  29.8   2.45    .0169     .581     .0317     .096       7.14         2.98          71
  31.6   2.60    .0212     .741     .0591     .120       7.58         3.16          75
  32.6   2.67    .0237     .835     .0809     .134       7.80         3.25          77
  33.4   2.74    .0263     .941     .1110     .150       8.01         3.34          80
  34.3   2.82    .0292    1.050     .1540     .168       8.22         3.43          82
  35.2   2.89    .0322    1.190     .2140     .191       8.43         3.51          84
  36.1   2.96    .0354    1.340     .3020     .217       8.64         3.59          86
  36.9   3.03    .0389    1.510     .4340     .249       8.85         3.69          88
  37.8   3.10    .0425    1.700     .6390     .291       9.05         3.77          90
  38.6   3.17    .0464    1.910     .9730     .351       9.25         3.85          92
  39.5   3.24    .0504    2.160    1.5600     .441       9.45         3.94          94
  40.5   3.32    .0558    2.520    3.2400     .681       9.70         4.04          96
  41.7   3.42    .0626    3.040   17.2000       --       9.98         4.16          99
  42.1   3.45    .0650    3.250   74.4000       --      10.10         4.20         100

Table E


U1108-01 MODEL BENCHMARK 
NIGHT SHIFT 

WORKLOAD FROM W/E 2 MAY 1976 



Runs 

SUPS 

I/O 

CPU 


Per 

Per 

Queue 

Queue 


Hour 

Hour 

Per Hr. 

Per Hr 

11.6 

2.05 

.008 

.361 


12.0 

2.13 

.009 

.415 


12.5 

2.20 

.010 

.476 


12.9 

2.28 

.011 

.545 


13.6 

2.36 

.013 

.622 


13.8 

2.43 

.015 

.710 


14,2 

2.51 

.017 

.809 


14.7 

2.59 

.019 

.921 


15.1 

2.66 

.021 

1.050 


15.5 

2.74 

.024 

1.190 


15.9 

2.81 

.026 

1.360 


16.4 

2.89 

.029 

1.550 


/l6.7 

2.94 

.032 

1.720 


Memory 
Queue 
Per Hr. 

Batch 
Queue 
Per Hr. 

Voluntary 
Delay 
Per Hr. 

Involuntary 
Delay 
Per Hr. 

Percent 

Saturation 

.036 

.924 

.891 

2.40 

70 

.048 

1.070 

.924 

2.49 

72 

.063 

1.230 

.958 

2.58 

75 

.369 


.991 

2.68 

77 

.497 


1.020 

2.77 

80 

.678 


1.060 

2.86 

83 

.944 


1.090 

2.95 

85 

1.350 


1.120 

3.03 

88 

2.030 


1.160 

3.12 

91 

3.260 


1.190 

3.21 

93 

6.020 


1.220 

3.30 

96 

16.000 


1.250 

3.39 

98 

J31.000 


1.270 

3.45 

100 


U1108-02 MODEL BENCHMARK
WORKLOAD FROM W/E 2 MAY 1976

Runs    SUPS    I/O       CPU       Memory     Batch     Voluntary  Involuntary
Per     Per     Queue     Queue     Queue      Queue     Delay      Delay        Percent
Hour    Hour    Per Hr.   Per Hr.   Per Hr.    Per Hr.   Per Hr.    Per Hr.      Saturation

.94     1.00    .102      .073      0          1.00      .020       .175         68
1.01    1.08    .132      .088      .671                 .021       .189         73
1.09    1.16    .168      .104      1.050                .023       .203         78
1.16    1.24    .211      .122      1.730                .024       .217         84
1.24    1.32    .263      .142      3.210                .025       .231         89
1.31    1.40    .325      .164      7.800                .027       .245         95
1.39    1.48    .399      .188      131.000              .028       .259         100




Table G


U1108-03 MODEL BENCHMARK
WORKLOAD FROM W/E 2 MAY 1976

Runs    SUPS    I/O       CPU       Memory     Batch     Voluntary  Involuntary
Per     Per     Queue     Queue     Queue      Queue     Delay      Delay        Percent
Hour    Hour    Per Hr.   Per Hr.   Per Hr.    Per Hr.   Per Hr.    Per Hr.      Saturation

11.0    1.54    .050      1.39      0                               2.32
        1.62    .058      1.72      .553                            2.44
12.5    1.70    .068      2.15      1.38                            2.56
13.1    1.78    .079      2.72      5.62                            2.68
13.4    1.82    .085      3.09      29.9                            2.74






Looking first at the U1108-01 system and the heavy day shift workload (Table D),
notice the sudden buildup in the memory queue prior to the saturation level. It
is the memory queue which overloads first, causing system saturation. The CPU
queue is the second most critical, while the I/O queue shows capacity still
available at system saturation.

Recall that CPU and I/O queue times as well as the SUP rate are included in the
memory queue input rate. Therefore, we may think of these three elements as
causing memory saturation. The CPU queue buildup is critical since it tends to
push the memory queue into a saturation condition. Notice that the CPU queue at
the actual operating level is about 1/5 of the SUP rate while at the saturation
level it is nearly equal to the SUP rate. This indicates that the CPU queue is
the most important contributor to the overloading of the memory queue (given the
program-size profile and memory availability actually experienced).
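
The comparison of the CPU queue with the SUP rate can be checked directly from
two rows of Table D. The following is a minimal sketch in Python; the dictionary
keys and the function name are illustrative only and do not come from the model
program.

```python
# Two rows of Table D (U1108-01 day shift): the actual operating level
# and the saturation level. All rates are hours accrued per hour.
ROWS = {
    "actual (71%)": {"sup": 2.45, "io_q": 0.0169, "cpu_q": 0.581, "mem_q": 0.0317},
    "saturation (100%)": {"sup": 3.45, "io_q": 0.0650, "cpu_q": 3.250, "mem_q": 74.4},
}


def queue_to_sup_ratio(row, queue):
    """Queue hours accrued per SUP hour for the named queue column."""
    return row[queue] / row["sup"]


for label, row in ROWS.items():
    cpu = queue_to_sup_ratio(row, "cpu_q")
    io = queue_to_sup_ratio(row, "io_q")
    print(f"{label}: CPU queue / SUP rate = {cpu:.2f}, I/O queue / SUP rate = {io:.3f}")
```

The I/O ratio stays below .02 in both rows, which is the sense in which the I/O
queue still shows spare capacity at saturation.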

A modeling distortion can be seen in the failure of the batch queue to saturate
at the actual operating level. Since the actual batch limit was used in running
the model, this queue should have saturated at the 71% level rather than the 99%
level. This discrepancy is caused by the model assumption that the batch and
demand work have identical profiles.

It is incorrect to assume from Table D that it would have been feasible to operate
the U1108-01 system at the rate of 3.45 SUPS per hour. While this would have been
theoretically possible, it would have caused an increase of over 8000% in the queue
time of each run. This degradation of response time in the demand terminal
environment would have been intolerable. The tradeoff of SUP rate for queue time
can be seen in figure 6. It is apparent that the actual operating level is nearly
optimum in terms of output gained per unit of delay. For this reason, and to be
conservative, we will assume that about 70% of saturation is optimum for the day
shift U1108-01.
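
The size of the queue time increase can be reproduced from the first and last
rows of Table D. The sketch below rests on two assumptions that are not stated
at this point in the text: the batch delay queue is left out of the total, and
queue time per run is taken to be proportional to queue time per SUP hour
because the workload profile is held fixed. The names are illustrative.

```python
def queue_per_sup(sup_rate, io_q, cpu_q, mem_q):
    """Combined I/O, CPU and memory queue hours accrued per SUP hour."""
    return (io_q + cpu_q + mem_q) / sup_rate


# Table D, U1108-01 day shift: 71% (actual) and 100% (saturation) rows.
actual = queue_per_sup(2.45, 0.0169, 0.581, 0.0317)
saturated = queue_per_sup(3.45, 0.0650, 3.250, 74.4)

# Percent increase in queue time for the same amount of service.
increase_pct = (saturated / actual - 1.0) * 100.0

print(f"queue hours per SUP hour at actual level: {actual:.3f}")
print(f"queue hours per SUP hour at saturation:   {saturated:.1f}")
print(f"increase: {increase_pct:.0f}%")
```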

Similarly, on the U1108-01 night shift, 70% saturation is taken as optimum. Note 
that the batch queue saturates closer to the actual operating level in Table E, 
indicating less demand influence on the total workload profile. As before, the 
memory queue is pushed into a saturation condition by the CPU queue (see figure 7). 

The U1108-02 system seems to be running under capacity during this timeframe (see
figure 8). An increase of 10% to 15% in the saturation level would affect the
performance very little. It, too, is limited by the memory queue, but the low
speed Fastrand drums make the I/O queue more critical than on the other two
systems.

The U1108-03 system appears to have been running at optimum capacity (see figure 
9). Again the memory queue is pushed to saturation by the CPU queue. 

From this analysis we conclude that the U1108-01 and U1108-03 systems were operated 
near optimum capacity during their effectively productive times in the test period. 

There are several approaches to assessing the effect of removing the U1108-02 
system. One way is to develop a composite workload profile from the work pro- 
duced by all three systems. This profile can then be tried against optional 
configurations. 

For example, running the composite workload against a U1108-01 configuration
yields the results in table H. If we assume an optimum capacity at the 70%
level, then it would be possible to produce 16.6 runs per hour. Recent studies


FIGURE 6

U1108-01 DAY SHIFT
SUP RATE VS QUEUE RATES


FIGURE 7

U1108-01 NIGHT SHIFT
SUP RATE VS QUEUE RATES

FIGURE 9


Table H


U1108-01 COMPOSITE WORKLOAD
WORKLOAD FROM W/E 2 MAY 1976

Runs    SUPS    I/O       CPU       Memory     Batch     Voluntary  Involuntary
Per     Per     Queue     Queue     Queue      Queue     Delay      Delay        Percent
Hour    Hour    Per Hr.   Per Hr.   Per Hr.    Per Hr.   Per Hr.    Per Hr.      Saturation

16.6    2.26    .016      .399      .050       .855      1.78       2.78         71
17.1    2.34    .018      .453      .064       .975      1.84       2.88         73
17.7    2.42    .020      .513      .084       1.110     1.90       2.97         75
18.2    2.49    .023      .580      .109       1.250     1.96       3.07         78
18.8    2.57    .026      .654      .416       --        2.02       3.16         80
19.4    2.64    .029      .738      .553       --        2.08       3.25         82
19.9    2.72    .032      .831      .746       --        2.14       3.34         85
20.5    2.79    .036      .936      1.030      --        2.20       3.44         87
21.0    2.87    .040      1.050     1.460      --        2.26       3.53         89
21.5    2.97    .044      1.180     2.160      --        2.31       3.62         92
22.1    3.01    .049      1.330     3.450      --        2.37       3.71         94
22.6    3.08    .053      1.500     6.330      --        2.43       3.80         96
23.1    3.16    .058      1.690     16.500     --        2.48       3.89         98
23.5    3.21    .063      1.850     148.000    --        2.53       3.95         100





indicate that effective productive time is about 85% of non-idle time (allowing
for downtime and PM). There were 3215 total runs produced in the test period.

At 16.6 runs per hour and 6.8 effective hours per shift, 28.5 shifts would be
needed to perform the work. Two U1108-01 configurations operating 15 shifts per
week could accomplish the work of the test period.

Even if the U1108-01 machine were able to reach its theoretical maximum of 23.5
runs per hour, it would require over 20 shifts of operation to complete the work.
Thus, we may conclude that two U1108-01 configurations could have handled the
work but one could not.

The model results of running the composite workload on the U1108-03 system are
depicted in table I. If we set the expected operating level at the 85% of
saturation point, as seen in the benchmark, then we would expect to produce about
10.8 runs per hour. Reasoning as for the U1108-01 we would conclude that 44
shifts of U1108-03 operation would be required by the test workload. This equates
to about three such machines operating all day five days per week.

We may also conclude that together the U1108-01 and U1108-03 configurations would
produce about 27.4 runs per hour and that each would require about 18 shifts of
operation per week to complete the 3215 runs of the test period.
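
The shift counts quoted above all follow from one division. The sketch below
repeats that arithmetic in Python; the function and variable names are
illustrative, and the run rates are the ones quoted in the text.

```python
TOTAL_RUNS = 3215              # runs produced in the test period
EFFECTIVE_HOURS_PER_SHIFT = 6.8


def shifts_required(total_runs, runs_per_hour,
                    effective_hours_per_shift=EFFECTIVE_HOURS_PER_SHIFT):
    """Shifts needed to complete total_runs at the given run rate."""
    return total_runs / (runs_per_hour * effective_hours_per_shift)


CASES = {
    "U1108-01 at about 70% of saturation": 16.6,
    "U1108-01 at theoretical maximum": 23.5,
    "U1108-03 at 85% of saturation": 10.8,
    "U1108-01 and U1108-03 together": 27.4,
}

for label, rate in CASES.items():
    print(f"{label}: {shifts_required(TOTAL_RUNS, rate):.1f} shifts")
```

The unrounded result for the two systems together is a little over 17 shifts
each, which the text rounds up to about 18.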


5.1 EXAMPLE CONCLUSION 

The most obvious options available with existing hardware if the U1108-02 system
were not available are:

1. To accomplish the work with the remaining 2 systems unchanged; 

2. To acquire 262K words of additional main memory and reconfigure the CPU's 
into three unit processor systems similar to U1108-03; 

3. To reconfigure the three processors into a single, three-CPU system; and 

4. To acquire another processor and configure two, dual-CPU systems similar 
to U1108-01. 

Of these we have seen that option 1 could not have accomplished the workload of
the test period without weekend work. Options 2 and 4 accomplish the work within
the 15 shifts of the standard work week. To test option 3 the composite workload
was tested against the U1108-01 configuration modified to include 3 processors.
The expected operating level of this configuration was 21.5 runs per hour. Thus,
a triple CPU configuration with maximum main memory would require about 22 shifts
to complete the test period work. One such system would not be adequate.

Of the two feasible options, number 2 is the cheapest to implement. The expected
operating levels of the two options do not differ significantly (33.2 runs per
hour for two dual processors versus 32.4 for three unit processors - well within
any reasonable estimate of the model error). The big question would concern the
heavy demand workload during the day shift period. How many of the unit
processors would be required to handle the day shift work now accomplished by
U1108-01 and would the response times be adequate?
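
The feasibility of options 2, 3 and 4 can be checked against the 15 shift
standard work week with a short calculation. This is a sketch only: the
per-system run rates are those quoted in the text, the 6.8 effective hours per
shift is the figure used earlier in this section, and the names are
illustrative.

```python
TOTAL_RUNS = 3215
EFFECTIVE_HOURS_PER_SHIFT = 6.8
SHIFTS_PER_WEEK = 15

# option label -> (number of systems, expected runs per hour per system)
OPTIONS = {
    "option 2: three unit processor systems": (3, 10.8),
    "option 3: one triple-CPU system": (1, 21.5),
    "option 4: two dual-CPU systems": (2, 16.6),
}


def shifts_per_system(systems, runs_per_hour_each):
    """Shifts each system must run when the work is split evenly."""
    combined_rate = systems * runs_per_hour_each
    return TOTAL_RUNS / (combined_rate * EFFECTIVE_HOURS_PER_SHIFT)


feasible = {}
for label, (systems, rate) in OPTIONS.items():
    needed = shifts_per_system(systems, rate)
    feasible[label] = needed <= SHIFTS_PER_WEEK
    print(f"{label}: {systems * rate:.1f} runs/hr combined, "
          f"{needed:.1f} shifts each, feasible={feasible[label]}")
```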

To answer these questions, the day shift workload profile from U1108-01 was
tested against the U1108-03 configuration. The expected run level turned out
to be 16.6 runs per hour, indicating about 10 shifts would be required to
accomplish the test period load of 1120 runs. This means two of the unit
processors would have to be dedicated to the U1108-01 day shift work.


Table I


U1108-03 COMPOSITE WORKLOAD
WORKLOAD FROM W/E 2 MAY 1976

Runs    SUPS    I/O       CPU       Memory     Batch     Voluntary  Involuntary
Per     Per     Queue     Queue     Queue      Queue     Delay      Delay        Percent
Hour    Hour    Per Hr.   Per Hr.   Per Hr.    Per Hr.   Per Hr.    Per Hr.      Saturation

10.8    1.46    .033      1.50      .085
11.4    1.54    .039      1.88      .200
12.0    1.62    .046      2.40      1.640
12.6    1.70    .053      3.12      11.300
12.7    1.72    .055      3.35      48.900




memory queues combined - excluding the batch delay queue) accrued per unit of
elapsed time. This will give us a feeling for the rate at which runs are delayed
because of the system load. For example, if we find queue time accruing at the
rate of 1/2 second per second of active run time, and if the operator of a demand
terminal made a request every 5 seconds, then processing of his requests would
be delayed an average of 2 1/2 seconds.

The day shift workload accrued .043 seconds of delay per second of elapsed time
on the dual processor and .144 seconds per second on the unit processor. Thus,
we could expect response time to about triple. We get the same relative answer
but a different absolute concept of the response time if we look at queue time
as a quotient of total service time. The dual processor accrues about .25 seconds
of delay per SUP second while the unit processor would accrue about .87 seconds
per second. Again, the response time triples.
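
The response time reasoning above reduces to two small calculations. The sketch
below is illustrative; the function name is not part of the model, and the delay
rates are the ones quoted in the text.

```python
def expected_delay(delay_rate, request_interval_seconds):
    """Average delay per request when queue time accrues at delay_rate
    seconds per second and a request is made every request_interval_seconds."""
    return delay_rate * request_interval_seconds


# Queue time accruing at 1/2 second per second, one request every 5 seconds.
print(expected_delay(0.5, 5.0))

# Relative response time, unit processor versus dual processor.
ratio_per_elapsed_second = 0.144 / 0.043   # delay per second of elapsed time
ratio_per_sup_second = 0.87 / 0.25         # delay per SUP second
print(f"{ratio_per_elapsed_second:.1f}x, {ratio_per_sup_second:.1f}x")
```

Both ratios come out between 3 and 3.5, consistent with the statement that the
response time about triples.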

As was mentioned at the beginning of this section, it is not the intent to develop
rigorously an argument for any particular reconfiguration of the SCC computers.
These examples are intended for illustrative effect. A thorough analysis would
require a better development of the projected workload. There is no assurance
that the workload of the week ending 2 May 1976 is representative of anything to
be seen in the future. We would also require a more careful definition of the
hypothetical configurations.

5.2 MODEL ACCURACY 

The question of model accuracy occurs at this point as we wonder about the validity 
of the various performance estimates cited in this section. Accuracy estimates may 
be made from benchmark runs. 

Comparing the model estimate of the elapsed time with the actual elapsed time
accrual provides an accuracy estimate. Although several months of data should be
benchmarked before any conclusive statement is made, so far the model has
estimated elapsed time closely (within about 10%).

The batch delay queue can also be used to determine the accuracy of the queue 
time estimates. We know that this queue, unlike the others, operates at the 
saturation level. That is, the number of batch runs active is equal to the batch 
run limit set by the console operator. This is true because the batch run backlog 
is almost never empty. 

Thus, if the model is calculating queue time correctly and if the SUP is repre- 
sentative of service requirements, the batch delay queue should saturate at the 
actual operating level. As has been pointed out, this happens for the two sys- 
tems that run solely batch work but does not for the U1108-01 which runs both 
demand and batch. 

The batch delay queue does not saturate on the U1108-01 model test at the correct 
level because no allowance is made for the differences between the batch and de- 
mand workload profile. This principle can be used to predict the profile of the 
U1108-01 batch work. On the day shift, for example, an inspection of the data 
in Table A indicates that the batch delay queue would have saturated at the 
proper level if batch work had accumulated .49 hours of elapsed time per run and 
required about .3 SUP hours per run. These happen to be the attributes of the 
work processed on the U1108-01 night shift which consists mostly of batch runs, 
leading to the observation that the batch delay queue seems accurate.

While this demonstration is not conclusive, it suggests a means of determining
model accuracy. Confidence can be gained only over a period of extended use.

A final comment on model accuracy, one with great intuitive appeal, is in order.

When the U1108-03 benchmark test was first made, prior to the test results pre- 
sented in this paper, it was noticed that the batch delay queue saturated before 
the supposed actual operating level. The model results were consistent with a 
data set that had accrued approximately 85 hours more of elapsed time than had 
apparently been experienced in the test period. A check was made and it was 
found that a program bug in the data collection routine had caused an under- 
statement of the elapsed time amounting to 83 hours. The model was right; the 
data was wrong. 

This example is admittedly melodramatic, but interesting. 

Model accuracy depends on: 

1. The accuracy of the queue calculations, 

2. The accuracy of the service requirement estimates, and

3. The accuracy of the model assumptions. 

Of these conditions, the most questionable is the second: service requirements 
estimates. The SUP does not state the exact system service load. The CPU charge 
does not include the total processor load. It is not apparent how much of the 
executive request charge is CPU time and how much is I/O. Preliminary indications 
are that the model is highly accurate and that current methods of estimating the 
service requirements are close enough for practical use. Experience with the 
model will allow development of a better accuracy estimate. 


6.0 MODEL IMPLEMENTATION 

A computer program implementing the model has been written in the FORTRAN V 
language to operate on the Univac 1108 computer under the EXEC VIII operating 
system. This program estimates accumulated elapsed time and other throughput 
parameters for input loads up to the system saturation level. Estimates are 
based on a specified workload profile and configuration definition. 

6.1 STRUCTURAL OVERVIEW 

The program is collected as one absolute link with no overlays. There is a main
program and 8 external subprograms. The calling sequence is as depicted in
Figure 10. All subprograms have one entry point designated by their respective
names.


6.2 FUNCTIONAL OVERVIEW 

The main program reads the configuration and workload definitions from a namelist
called $INPUT. All performance parameters are calculated and the output reports
are written. DELAYS calculates the voluntary and involuntary delay estimates;
MEMUTL calculates the memory utilization estimate; QUEUE calculates all queue
time estimates; and TMSWAP is an experimental subroutine estimating the time
required to swap programs in and out of main memory. WEGIT is a MATHPAC routine
used for solving an implicit function by iterations. WAIT is used in calculating
queue times and PHAT is part of the experimental time-to-swap code. GAMMA is
another MATHPAC routine used to evaluate the Gamma or factorial function.

6.3 LOGIC FLOW AND MATHEMATICS


6.3.1 Main Program 

The program reads a namelist called $INPUT. The input parameters are as depicted
in table J.

The namelist is written to the standard print file for checking. 

The number of words transferred is used to calculate the I/O time based on the
device specifications and the I/O traffic patterns. The SUP rate is set to an
initial value of .1 SUPS per hour and incremented by .02 SUPS per hour with
each iteration.

In the main loop where elapsed time parameters are calculated, the input to the 
queue calculations is prepared. All parameters are converted to a rate per unit 
of effective productive time. 

A call to DELAYS calculates the voluntary and involuntary delay time. 

A call to TMSWAP calculates the time required for swap activity and the number 
of swaps per hour. 

The CPU queue time is calculated by a call to QUEUE using the CPU time plus the
executive request time as the input rate. This assumes that all executive
request time is spent on the processor. It also assumes that these two items are
exhaustive of CPU requirements. Neither assumption is entirely correct but
recent system audits using SIP indicate this technique yields a reasonable
estimate of CPU requirements.

The I/O queues are calculated for each device type. In this case, the input
rate to the queue calculation is the time required to transfer the words
indicated in the workload profile.

The memory queue is calculated using the SUP rate and the total queue rate as 
the input rate. 

To calculate the batch delay queue, the input rate is taken as the SUP rate plus
the memory queue plus voluntary and involuntary delay time less the batch queue
itself. This implicit function is solved by an iterative technique using a
Wegstein approximation. The input rate to the batch delay queue assumes that
batch runs have the same profile as demand runs. This assumption is made in
all categories of elapsed time except voluntary delay. The correct voluntary
delay estimate for batch work is used. Since batch work has different service
requirements than demand work, this assumption leads to some distortion of the
batch delay queue when demand work is present.
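
The implicit batch delay calculation has the form B = Q(S + M + V + I - B),
where B appears on both sides. The following is a minimal sketch of a Wegstein
iteration for an equation of the form x = g(x). It is not the MATHPAC routine
WEGIT, whose calling sequence is not given here. The function toy_queue and its
capacity of 20 are hypothetical stand-ins for the model's queue calculation, and
the rates are taken from the first row of Table D for illustration only.

```python
def wegstein(g, x0, tol=1e-10, max_iter=100):
    """Solve x = g(x) by Wegstein's accelerated fixed-point iteration."""
    x_prev = x0
    g_prev = g(x_prev)
    x = g_prev                       # first step is plain substitution
    for _ in range(max_iter):
        gx = g(x)
        if abs(gx - x) < tol:
            return gx
        s = (gx - g_prev) / (x - x_prev)   # secant slope of g
        # Fall back to plain substitution if the slope makes q undefined.
        q = 0.0 if s == 1.0 else s / (s - 1.0)
        x_next = q * x + (1.0 - q) * gx
        x_prev, g_prev, x = x, gx, x_next
    raise RuntimeError("Wegstein iteration did not converge")


def toy_queue(rate, capacity=20.0):
    """Hypothetical queue curve: grows without bound as rate nears capacity."""
    return rate * rate / (capacity * (capacity - rate))


# SUP rate + memory queue + voluntary delay + involuntary delay (hours per hour).
total_input = 2.45 + 0.0317 + 7.14 + 2.98
batch_queue = wegstein(lambda b: toy_queue(total_input - b), x0=0.0)
print(f"batch delay queue (toy curve): {batch_queue:.4f}")
```

The update is equivalent to a secant step on g(x) - x, so it usually converges
in far fewer passes than repeated substitution.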

The batch delay queue is subtracted from the batch portion of the memory queue
since runs do not accumulate memory wait time while detained by the batch delay
valve.

Output parameters are set up and written to an output file. One report is
written directly to the standard output file and other parameters are written to
an alternate file.

NAME     DIMENSION  TYPE     DESCRIPTION                                               UNITS

ACCESS   10         Real     Average access time for up to 10 device types.            seconds
XFER     10         Real     Average transfer rate for up to 10 device types.          words/sec.
MEMORY   1          Integer  Amount of user accessible main memory.                    core blocks
SERV     10         Real     Number of independent I/O paths for each device type.
NUMUNT   1          Integer  Number of I/O device types.
NUMCPU   1          Real     Number of CPU's.
ISWAP    1          Integer  Index of the device type containing swap files.
USEAGE   10         Real     I/O traffic patterns.                                     Percent of words
WORDS    1          Real     Words transferred per run.                                Words/run
ELR      1          Real     Elapsed time accumulated per run.                         Hrs/run
CPUW     1          Real     CPU time per word.                                        Hrs/word
ERCC     1          Real     Ratio of executive request charge to CPU time.            ERCC/CPU
VDR      1          Real     Voluntary delay per run.                                  Hrs/run
SIZE     1          Real     Average main memory requirements per run.                 Core blocks
DEMPER   1          Real     Percent of runs that are demand runs.
TAPR     1          Real     Tape mounts per run.                                      Tape/run
RUNLVL   1          Real     Average limit of number of runs resident in main memory.
BATLIM   1          Real     Maximum batch runs active.


Table J

When the batch delay queue saturates, its value is set to zero for subsequent
input levels. When any other queue saturates, the system is assumed to be
saturated. A diagnostic is written and the incrementing of the SUP rate stops.

The output parameters on the alternate file are written to the standard print 
file. 
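
The control flow of the main loop can be sketched as follows. This is a
simplified illustration, not the program itself: it sweeps only a CPU queue and
one I/O queue, it omits the memory queue, the batch delay queue and the delay
regressions, and it substitutes the standard M/M/c (Erlang C) formula for the
queue mathematics of Section 2.0, which is an assumption. The service fractions
0.55 and 0.2 are hypothetical workload values.

```python
import math


def queue_time(load, servers):
    """Queue hours accrued per hour for an M/M/c queue (Erlang C).

    load is the offered service hours per hour. Returns -1.0 when the
    queue is saturated, following the convention described for QUEUE.
    """
    if load >= servers:
        return -1.0
    if load <= 0.0:
        return 0.0
    tail = load ** servers / math.factorial(servers) * servers / (servers - load)
    head = sum(load ** k / math.factorial(k) for k in range(servers))
    p_wait = tail / (head + tail)
    return p_wait * load / (servers - load)


def sweep(cpu_per_sup, io_per_sup, cpus, io_paths, start=0.1, step=0.02):
    """Raise the SUP rate from start in fixed steps until a queue saturates."""
    rows = []
    n = 0
    while True:
        sup_rate = start + n * step
        cpu_q = queue_time(cpu_per_sup * sup_rate, cpus)
        io_q = queue_time(io_per_sup * sup_rate, io_paths)
        if cpu_q < 0.0 or io_q < 0.0:
            break                      # system saturated: stop incrementing
        rows.append((sup_rate, cpu_q, io_q))
        n += 1
    return rows


rows = sweep(cpu_per_sup=0.55, io_per_sup=0.2, cpus=2, io_paths=1)
saturation_rate = rows[-1][0]          # last SUP rate before saturation
print(f"{len(rows)} data points, saturation near {saturation_rate:.2f} SUPS per hour")
```

The last unsaturated SUP rate serves as the saturation level against which the
earlier rows can be expressed as a percentage.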

6.3.2 DELAYS (TIPMNT, BATCH, DEMAND, VOLDLL, INVLL)

This subroutine calculates: 

VOLDLL: The voluntary delay estimate, and 
INVLL: The involuntary delay estimate, 

based on 

TIPMNT: The number of tape mounts, 

BATCH: The number of batch runs, 

DEMAND: The number of demand runs. 

Regression curves are used to calculate the two forms of delay. 

6.3.3 MEMUTL (MEMSUP, SUPRAT, TOTQ) 

This function calculates the memory utilization based on 

MEMSUP: the SUP weighted run size, 

SUPRAT: the SUP rate per hour,

TOTQ: the total queue time. 

Although the calculation is trivial, it is contained in a separate subprogram
because of plans to modify the model to estimate actual memory residency.

6.3.4 TMSWAP 

This experimental subroutine is not yet completed. 

6.3.5 PHAT 

This experimental subroutine is not yet complete. 

6.3.6 QUEUE (A, B, C) 

This function calculates the average queue time based on the mathematics of 
Section 2.0. When a queue saturates, the value of QUEUE is set to -1. 

The GAMMA function is used to calculate the factorial function. 

6.3.8 GAMMA 

A MATHPAC function. 

6.3.9 WEGIT 

A MATHPAC function.

6.4 INPUT


Program input comes in through one namelist (see table J). The format is as
follows:

Card Column 12

$INPUT

((Parameter definitions))

$END

6.5 OUTPUT 

Tables K, L, M, and N are the four reports output by the model.

Table K is the listing of input parameters from namelist $INPUT. 

(All rate parameters are expressed in terms of hours of effective productive 
time.) 

In table L, the parameters are as follows: 

SUP Rate: SUP hours per hour.

RUN Rate: Runs per hour.

CPU Rate: CPU hours per hour.

QUEUE Rate: Queue hours per hour.

VOLDEL Rate: Voluntary delay hours per hour as estimated by the model.

INVOL Rate: Involuntary delay hours per hour as estimated by the model.

ELAPSE Rate: Elapsed hours per hour as estimated by the model.

VOLDEL Rate (A): Actual voluntary delay hours per hour pro-rated for the
run rate.

INVOL Rate (A): Actual involuntary delay hours per hour pro-rated for the
run rate.

ELAPSE Rate (A): Actual elapsed hours per hour pro-rated for the run rate.

TAPMNT Delay: Involuntary delay minutes per tape mount.

BATCH QUEUE Rate: Batch queue hours per hour.

The diagnostic "QUEUE SATURATION" indicates that a queue has saturated. The
following two lines indicate the values of the various queues when saturation
occurred. In this case, the SWAP or memory queue saturated first and was set
to -1.

The values for actual voluntary delay, involuntary delay and elapsed time are
included for comparison only. This comparison is the sole purpose of inputting
these parameters. They are not used in model estimates. The actual values are
developed on a pro rata basis and are meaningful only in the neighborhood of the
actual run level for benchmark tests. For purely hypothetical workloads, they
have little or no meaning. Likewise, the minutes-per-tape-mount is valid only
in the actual run level neighborhood since it is calculated from actual
involuntary delay.

In table M, the parameters are as follows: 

SUP Rate: same as above.

RUN Rate: same as above.

TOTAL Queue: same as above.

CPU Queue: CPU queue hours per hour.

MEMORY Queue: Memory queue hours per hour.

I/O QUEUE: I/O queue hours for all device types.

I/O iQUEUE: I/O queue hours per hour for device type i. The report is for- 
matted for only five device types. 

In table N, the parameters are as follows:

SUP Rate: same as above. 

TIME TO SWAP: The time required to accomplish swapping activity (experimental). 
CPU UTIL: Percent of time CPU produces billable service. 

SWAP Rate: Swaps per hour (experimental). 

MEMORY UTIL: Average number of core blocks required for resident, busy runs. 
Resident, delayed runs are excluded. 

PERCENT SATURATION: The ratio of the current-line SUP rate to that at saturation.
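
The percent saturation figure can be checked against the benchmark tables. The
sketch below applies the ratio to the first and last SUP rates of three of the
benchmark tables; the function name is illustrative.

```python
def percent_saturation(sup_rate, saturation_sup_rate):
    """Current SUP rate as a percentage of the SUP rate at saturation."""
    return 100.0 * sup_rate / saturation_sup_rate


# (actual SUP rate, SUP rate at saturation) from the benchmark tables.
BENCHMARKS = {
    "U1108-01 day shift": (2.45, 3.45),
    "U1108-01 night shift": (2.05, 2.94),
    "U1108-02": (1.00, 1.48),
}

for label, (actual, saturated) in BENCHMARKS.items():
    print(f"{label}: {percent_saturation(actual, saturated):.0f}%")
```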

6.6 FILE ASSIGNMENTS

All input is read from the standard input file "READ$" equated to logical unit 
number 5 in the FORTRAN source code. 

All reports are written to the standard print file PRINT$, FORTRAN logical unit 6. 

Intermediate unformatted output is written to a sequential file named "25". This 
file is dynamically assigned to mass storage. 

6.7 PROGRAM EXECUTION 

Program execution is accomplished by the following setup: 

Card Column 12 

@RUN

@XQT

$INPUT 

((input parameters)) 

$END 

@FIN 

The program requires a total main memory allocation of about 12K decimal words.

A typical execution requires between one and two minutes of CPU time. 


7.0 PROGRAM LISTING 

See Figure 11 for the program listing.

THIS PROGRAM CALCULATES AN ELAPSED TIME PROFILE FOR
CALCULATE BATCH QUEUE AND ADJUST SWAP QUEUE
WRITE ADDITIONAL OUTPUT ON ALTERNATE FILE


WRITE (25) SUPRAT,RUNRAT,TOTQ,CPUQ,SWAPQ,TRAFQ,(TRAFIQ(I),I=1,NUMUNT)

— IN T T 7 5UP R A r; tih swpTswoppTcrrxc PirruTCMrK 

SUPRAT=SUPRAT+.02 @ INCREMENT THE SUP RATE

” — m; 0 ^T 0 "3 3 CTTLCULATE ANOTHER DATA" PO TNT 

120 WRITE (6,130) SWAPQ,CPUQ,(TRAFIQ(I),I=1,NUMUNT)


i r/0’r/faXf7FX0.7)l 

S A T r at:: S UFR AT--^ 2^ oi KE'CXPT UPL“ THE S ATUR-ATIOfmSUP ' H^TTE 



























[Figure 11, continued: compiler storage-assignment listings and the collector
allocation map for GAMMA, WEGIT, WAIT, QUEUE, PHAT, TMSWAP, MEMUTL, and DELAYS.]





